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Chapter  8 

Background  error  correlation  modeling  with 
diffusion  operators 

Max  Yaremchuk,  Matthew  Carrier,  Scott  Smith  and  Gregg  Jacobs 


Abstract  Many  background  error  correlation  (BEC)  models  in  data  assimilation  are 
formulated  in  terms  of  a  positive-definite  smoothing  operator  B  that  is  employed 
to  simulate  the  action  of  correlation  matrix  on  a  vector  in  state  space.  In  this  chap¬ 
ter,  a  general  procedure  for  constructing  a  BEC  model  as  a  rational  function  of  the 
diffusion  operator  D  is  presented  and  analytic  expressions  for  the  respective  corre¬ 
lation  functions  in  the  homogeneous  case  are  obtained.  It  is  shown  that  this  class  of 
BEC  models  can  describe  multi-scale  stochastic  fields  whose  characteristic  scales 
can  be  expressed  in  terms  of  the  polynomial  coefficients  of  the  model.  In  particular, 
the  connection  between  the  inverse  binomial  model  and  the  well-known  Gaussian 
model  B?  =  exp  D  is  established  and  the  relationships  between  the  respective  decor¬ 
relation  scales  are  derived. 

By  its  definition,  the  BEC  operator  has  to  have  a  unit  diagonal  and  requires  ap¬ 
propriate  renormalization  by  rescaling.  The  exact  computation  of  the  rescaling  fac¬ 
tors  (diagonal  elements  of  B )  is  a  computationally  expensive  procedure,  therefore 
an  efficient  numerical  approximation  is  needed.  Under  the  assumption  of  local  ho¬ 
mogeneity  of  D,  a  heuristic  method  for  computing  the  diagonal  elements  of  B  is 
proposed.  It  is  shown  that  the  method  is  sufficiently  accurate  for  realistic  applica¬ 
tions,  and  requires  102  times  less  computational  resources  than  other  methods  of 
diagonal  estimation  that  do  not  take  into  account  prior  information  on  the  structure 
of  B. 
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8.1  Introduction 

In  recent  years,  heuristic  background  error  correlation  (BEC)  modelling  has  become 
an  area  of  active  research  in  geophysical  data  assimilation.  Of  particular  interest  are 
the  BEC  models  constructed  with  positive  functions  of  the  diffusion  operator. 


D  =  VvV 


(8.1) 


where  v  is  the  spatially  varying  positive-definite  diffusion  tensor.  This  type  of  BEC 
model  is  attractive  for  several  reasons:  a)  it  guarantees  positive  definiteness  of  the 
resulting  correlation  functions  (CFs),  b)  it  is  computationally  inexpensive  in  most 
practical  applications,  and  c)  it  allows  straightforward  control  of  inhomogeneity 
and  anisotropy  via  the  diffusion  tensor.  In  the  traditional  approach  of  correlation 
modeling  where  spatial  correlations  are  specified  by  prescribed  analytical  functions, 
care  should  be  taken  to  maintain  positive  definiteness  of  the  respective  correlation 
operator,  especially  in  anisotropic  and/or  inhomogeneous  cases  [1,2]. 

Among  the  most  popular  operators  B  used  in  practical  BEC  modeling  are  those 
using  the  exponential  and  the  inverse  binomial  functions  of  D: 


where  I  is  the  identity  operator,  a  is  a  scaling  parameter  and  m  is  a  positive  integer. 
Since  D  has  a  non-positive  spectrum  whose  larger  eigenvalues  correspond  to  the 
smaller-scale  eigenvectors,  the  operators  B„  and  Bm  are  positive-definite  and  sup¬ 
press  small-scale  variability.  Both  types  of  BEC  models  (8.2)  are  extensively  used 
in  geophysical  applications.  Numerically,  they  are  implemented  by  integration  of 
the  diffusion  equation  using  either  explicit  (in  the  case  of  B?  [3,  4,  5])  or  implicit 
(in  the  case  of  Bm  [6,  7])  integration  schemes. 

A  disadvantage  of  the  BEC  models  (8.2)  is  that  there  is  a  limited  freedom  in  the 
shape  of  local  CFs,  which  have  either  the  shape  of  the  Gaussian  bell  (B?)  or  provide 
its  mth-order  strictly  positive  approximations  (B,„)  [8,  9].  In  order  to  allow  negative 
correlations,  one  has  to  consider  operators  generated  by  the  arbitrary  polynomials  in 
D.  The  quadratic  polynomial  case  was  studied  recently  by  Hristopulos  and  Elogne 
[10,  11]  and  Yaremchuk  and  Smith  [9],  who  obtained  analytic  representations  of  the 
CFs  and  derived  relationships  between  the  polynomial  coefficients  and  the  spectral 
parameters  of  B  in  the  homogeneous  case. 

In  a  more  realistic  inhomogeneous  setting,  the  diffusion  tensor  varies  in  space, 
making  analytic  methods  inapplicable.  Nevertheless,  they  can  still  give  a  reason¬ 
able  guidance  for  quick  estimation  of  the  diagonal  elements  of  B  (normalization 
factors),  whose  values  are  crucial  for  constructing  the  BEC  models.  The  importance 
of  accurately  computing  diagB  is  evident  from  the  fact  that  the  operators  B  under 
consideration  are  formulated  numerically  as  multiplication  algorithms  by  the  ma¬ 
trices,  whose  elements  are  not  explicitly  known.  On  the  other  hand,  since  the  BEC 
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operator  C  is  represented  numerically  by  the  correlation  matrix,  it  must  have  a  unit 
diagonal  and,  therefore,  knowledge  of  the  diagonal  elements  of  B  is  required  for 
renormalization: 

C  =  (diagB  p^B  (diagB)'1/2  (8.3) 

Equation  (8.3)  shows  that  the  considered  BEC  models  involve  two  separate  algo¬ 
rithms:  one  for  computing  the  action  of  B  and  another  for  estimating  the  normaliza¬ 
tion  factors  (diagB)  that  are  necessary  for  computing  the  action  of  (diagB)  - 'P 

Purser  with  coauthors  [12,  13,  14]  were  among  the  first  to  employ  analytic  meth¬ 
ods  for  estimating  the  normalization  factors  for  the  Gaussian  operator  B?  in  geo¬ 
physical  applications.  Somewhat  earlier,  an  asymptotic  technique  was  developed 
for  estimating  the  diagonal  of  the  Gaussian  kernel  in  Riemannian  spaces  to  study 
quantum  effects  in  general  relativity  (e.g.,  [15], [16]).  These  ideas  can  be  utilized  to 
derive  a  useful  algorithm  for  estimating  the  normalization  factors. 

In  this  chapter,  we  first  give  an  overview  of  the  recent  developments  in  con¬ 
structing  the  D-operator  BEC  models,  and  illustrate  their  major  features  with  the 
examples  in  the  homogeneous  case  v  =  const.  In  particular,  in  section  2.2,  the  rela¬ 
tionships  between  the  scaling  parameters  for  the  Gaussian  model  and  its  with-order 
approximation  (8.2)  are  obtained  and  the  respective  CFs  are  given.  In  section  2.3  the 
inverse  binomial  model  is  extended  to  an  arbitrary  polynomial  of  D:  Expressions  for 
the  CFs  and  normalization  factors  are  derived,  and  relationships  are  established  be¬ 
tween  the  structure  of  the  BEC  spectrum  and  the  polynomial  coefficients.  In  section 
3,  after  a  brief  overview  of  the  diagonal  estimation  methods,  a  heuristic  formula  for 
computing  diagB„  is  derived  (section  3.2)  and  then  tested  numerically  against  other 
methods  in  a  set  of  realistic  oceanographic  applications  (sections  3. 3-3. 5).  Results 
of  similar  tests  with  the  Bm  model  are  also  presented.  Summary  and  discussion  of 
the  prospects  for  the  D-operator  BEC  modeling  complete  the  chapter. 


8.2  Diffusion  operator  and  covariance  modeling 

The  convenience  of  the  diffusion  operator  (8.1)  for  constructing  the  BEC  models 
can  be  explained  by  the  non-negative  spectrum  of  D:  An  operator  that  is  gener¬ 
ated  by  a  positive  rational  function  F  of  —  D  whose  eigenvalues  tend  to  zero  at  large 
wavenumbers,  is  positive-definite  and  has  a  smoothing  property,  i.e.  tends  to  sup¬ 
press  high-frequency  components  of  the  solution.  In  this  section  we  consider  two 
types  of  such  functions:  Those  that  are  generated  by  the  with -order  binomials  (Sec¬ 
tion  2.2)  and  the  others  by  the  inverse  of  a  positive  polynomial  (Section  2.3).  To 
allow  analytical  treatment,  anisotropic  homogeneous  case  in  the  boundless  domain 
is  considered. 

The  benefit  of  analytical  consideration  is  its  ability  to  reveal  local  correlation 
structure  and  therefore  provide  a  reasonable  guidance  to  construction  of  more  gen¬ 
eral  operators  B.  In  addition,  as  it  has  been  shown  recently,  good  approximations  to 
diagB  can  be  obtained  by  using  analytical  results  obtained  with  the  homogeneous 
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versions  of  B  (e.g.,  [12,  17,  18]).  Therefore,  analytical  formulas  describing  homo¬ 
geneous  BEC  operators  are  of  significant  practical  interest.  The  analytical  results 
may  facilitate  practical  design  of  the  cost  functions  in  variational  data  assimilation 
problems,  because  they  give  explicit  relationships  between  the  shape  of  the  local 
CFs  and  the  structure  of  the  corresponding  BEC  operator. 


8.2.1  Correlation  functions  and  normalization 

Consider  an  anisotropic,  homogeneous  diffusion  operator  (8.1)  in  R",n  =  1, ...  ,3, 
with  x  G  R"  representing  points  in  the  physical  space.  By  using  the  coordinate  trans¬ 
formation  x'  =  v-1/2*,  the  problem  can  be  reduced  to  considering  isotropic  opera¬ 
tors  of  the  form 

B=F(-4),  (8.4) 

where  A  is  the  Laplacian  (e.g.,  [8,  10])  and  F  is  an  arbitrary  positive  function.  In  the 
case  of  an  inhomogeneous  diffusion  (v  f  const )  the  global  transformation  cannot  be 
found.  Transformations  of  this  type,  however,  can  be  used  locally  for  constructing  B 
and  the  normalization  factors  (Section  3).  All  of  the  formulas  that  are  written  below 
are  assumed  to  be  in  the  transformed  coordinates  x1  with  primes  omitted  to  simplify 
the  notation. 

The  operator  (8.4)  is  diagonalized  with  the  Fourier  transform,  and  the  diagonal 
elements  are  B(k )  =  F(k2)  where  k  is  the  Fourier  coordinate  (wavenumber).  Be¬ 
cause  of  homogeneity,  the  matrix  elements  of  B  in  the  x -representation  depend  only 
on  the  distance  r  =  |jc|  from  the  diagonal.  They  can  be  computed  by  applying  the 
inverse  Fourier  transform  to  B(k): 

B"(x)  =  (2ti)~n  I  B{k)  exp(—ikx)dk.  (8.5) 

R" 


By  integrating  over  the  directions  in  R"  (Appendix  1),  (8.5)  can  be  reduced  to 

oo 

B'\r)  =  (2  n)-n/2  J  B(k)kn-\kr)sJ-s(kr)dk  (8.6) 

o 

where  k  =  |£|,  J  denotes  the  Bessel  function  of  the  first  kind,  and  ,v  =  1  —n/2. 
The  respective  matrix  elements  of  the  correlation  operator  (CFs)  are  obtained  by 
normalization: 

Cn(r)=B'\r)/Bn(p)  (8.7) 

In  practical  applications,  the  diffusion  operator  is  not  homogeneous,  and  the  analytic 
representations  (8. 6-8. 7)  cannot  be  obtained.  However,  the  action  of  B  on  a  state 
vector  can  be  computed  numerically  at  a  relatively  low  cost.  The  major  problem 
with  such  modelling  is  the  efficient  estimation  of  the  diagonal  elements 
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B"(x,x)  =  f  B"(x,y)8{x-y)dy  (8.8) 

R" 

which  are  necessary  to  rescale  B  to  have  its  diagonal  elements  equal  to  unity.  In 
practice,  the  rescaling  factors  N”(x)  are  defined  as  reciprocals  of  B" (x.x). 

Computing  the  integral  (8.8)  numerically  is  expensive,  because  the  convolutions 
with  the  8 -functions  have  to  be  performed  at  all  points  x  of  the  numerical  grid. 
However,  reasonable  approximations  [12,  18]  for  N'’(x)  can  be  obtained  by  using 
asymptotic  expansions  of  (8.8)  under  the  assumption  of  weak  inhomogeneity  (see 
Section  3). 


8.2.2  The  Gaussian  model  and  its  binomial  approximations 


The  Gaussian-shaped  correlation  model  is  widely  used  in  geophysical  applications. 
Numerically,  it  is  implemented  by  approximating  exp(crD/2)  with  the  binomial: 


Bg(D)  =  exp( 


j2D 


a2  D 
2m 


(8.9) 


where  m  is  a  large  positive  integer.  This  numerical  approach  is  often  referred  to 
as  ’’integration  of  the  diffusion  equation”  and  has  been  used  in  practice  for  several 
decades  [3,  4,  5,  7],  There  is,  however,  a  certain  disadvantage  associated  with  the 
numerical  stability  of  the  integration:  The  number  of  “integration  time  steps”  m  has 
to  be  large  enough  for  the  eigenvalues  of  the  binomial  operator  in  the  rhs  of  (8.9) 
to  be  less  than  1  in  the  absolute  value.  This  constraint  may  limit  m  from  below  by  a 
large  value,  which  can  make  the  computation  rather  expensive. 

Another  option  is  to  use  a  different  approximation  in  (8.9): 


Bm(D) 


(8.10) 


The  eigenvalues  of  the  operator  in  the  rhs  of  (8.10)  do  not  exceed  1,  and  the  “inte¬ 
gration  procedure”  is  unconditionally  stable.  This  approach  is  often  referred  to  as 
“implicit  integration  of  the  diffusion  equation”  (see  Appendix  2).  and  has  been  used 
in  many  practical  applications  as  well  [6,  7,  19]. 

In  the  Fourier  representation  both  models  (8.9)  and  (8.10)  approximate  the  same 
Gaussian  function  of  k: 


B"(*) 

*%(*) 


1- 


a2k2 


2m 
a2k 2 


2m 


a2k2 

:  exP( - j-) 

a2k2 

-  exp( - — ) 


(8.11) 

(8.12) 


182 


Yaremchuk  et  al. 


Since  the  value  of  m  in  (8.11)  is  fairly  large  in  practice,  the  resulting  CF  is  hardly 
distinguishable  from  a  Gaussian-shaped  curve  with  a  half-width  a. 

Substituting  (8.12)  into  (8.6),  integrating  over  k,  and  normalizing  the  result  by 
b;;,(o)  yields  the  CFs  of  the  Matern  family  [20]  enumerated  by  s  =  m  —  n  / 2  and 
scaled  by  a*  =  a/ \fl m: 


Cnm(p ) 


P%(P) 

2s-1r(s)’ 


(8.13) 


where  p  =  r/a*,  F  is  the  gamma-function  and  K  stands  for  the  modified  Bessel 
function  of  the  second  kind  [22].  The  respective  normalization  factors  are 


Nn 


y/Jtr  (m) 
r(m—  1/2) 


(8.14) 


where  a>\  =  2,  fth  =  27T,and  (O3  =  47T.  In  the  limiting  case  of  m  — >  °°,  the  CFs  (8.13) 
take  the  Gaussian  form: 


C'l,  =  exp(— r2/2a2);  n=  1,..  (8.15) 

Consecutive  approximations  of  the  Gaussian  CF  by  (8.13)  are  shown  in  Figure  1.  It 
is  remarkable  that  when  m  =  1,  the  CFs  (8.13)  have  singularities  at  p  =  0  in  both  two 
and  three  dimensions  (see  also  Table  1).  This  means  that  in  the  continuous  case  the 
first-order  approximations  become  invalid  when  n  >  1 .  Numerically,  however,  the 
CFs  do  exist  for  n  >  1,  but  their  decorrelation  scale  is  limited  by  the  grid  size  8  (the 
corresponding  CF  is  shown  by  the  dotted  line  in  the  left  panel  of  Fig.  1).  This  occurs 
because  the  numerical  analogue  of  the  5-function  is  never  singular,  but  has  a  finite 
amplitude  inversely  proportional  to  the  volume  of  a  grid  cell,  therefore,  resulting  in  a 
finite  value  of  the  convolution  (8.8)  even  if  it  is  infinite  in  the  continuous  case.  After 
normalization  by  that  finite  value,  the  CF  is  1  at  r  =  0,  but  its  effective  decorrelation 
scale  remains  proportional  to  the  local  grid  size. 

The  left  panel  in  Figure  1  shows  that  low-order  binomial  approximations  (8.13) 
underestimate  the  decorrelation  scale  a  of  the  target  Gaussian  function.  This  un¬ 
pleasant  property  can  be  corrected  by  optimizing  the  value  of  a  in  (8.10)  to  obtain 
the  best  fit  with  the  Gaussian  CF.  Since  the  Gaussian  and  its  approximating  func¬ 
tions  are  both  positive  and  have  similar  shapes,  a  reasonable  optimization  criterion 
is  to  set  their  integral  decorrelation  scales  equal  to  each  other: 

jcm(p)dr=^^jcm(y)dy=jex  (8'16) 

0  00 

Expression  (8.16)  shows  that  aopt  =  where  the  rescaling  coefficient  q",  is  de¬ 
fined  as: 

/  Cnm(y)dy 
0 


ns) 


r(s  + 1/2) 


&  =  y/jm 


(8.17) 
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Fig.  8.1  Left:  Binomial  approximations  (8.13)  of  the  Gaussian  CF  in  two  dimensions  ( n  =  2). 
The  CF  for  m  =  1  is  shown  by  the  dotted  line  for  the  numerical  realization  with  the  grid  step 
8  =  a/4.  Middle:  Same  approximations,  but  with  optimally  adjusted  correlation  radii  for  various 
combinations  of  m  and  n.  Right:  Differences  between  the  Gaussian  CF  and  its  approximations 
shown  in  the  middle  panel.  The  horizontal  axes  are  scaled  by  a. 


The  values  of  for  m.n  <  4  and  their  respective  approximation  errors 

4  =  llC-C.It/r/lllC.lt/r] 

o  o 

are  assembled  in  Table  1 . 


Table  8.1  Correlation  functions  associated  with  the  power  approximations  (8. 10)  of  the  Gaussian 
CF  in  n  dimensions.  The  CFs  for  n  =  1  and  3  are  rewritten  in  terms  of  elementary  functions 
for  convenience.  The  correlation  radius  adjustment  coefficients  <§"  are  shown  below  the  formulas 
together  with  the  corresponding  relative  errors  eJJ,  in  approximation  of  the  Gaussian  CF  (bold 
numbers). 


n  =  1 

n  =  2 

71  —  3 

m  =  1 

exp(-p) 
yjn  0.33 

Ko(P ) 

e*P(-p)/P 

m  =  2 

(l+p)exp(— p) 

sjlijl  0.13 

pKi(p) 
v/8 Jn  0.19 

exp(— p) 

ViK  0.33 

m  =  3 

(l+p+p2/3)exp(-p) 

y/zTn/S  0.08 

~pWpW~ 

^16/1,71  0.10 

(l+p)exp(— p) 
•^/3tt/4  0.13 

The  coefficients  <*”  along  with  relationship  (8.12)  provide  an  expression  for  esti¬ 
mating  the  scaling  parameter  in  the  binomial  model  (8.10)  which  approximates  the 
Gaussian-shaped  CF  with  a  given  radius  a: 


Q-binom  —  V^2/n 


(8.18) 
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8.2.3  The  inverse  polynomial  model 

A  certain  disadvantage  of  the  binomial  models  (8.9)  and  (8.10)  is  their  inability  to 
represent  oscillating  CFs  whose  spectra  may  have  multiple  maxima.  This  issue  can 
be  overcome  by  considering  the  BEC  models  of  the  form. 

-l 

(8.19) 

Here  aj  are  the  real  numbers,  constrained  by  the  positive  definiteness  requirement 
of  B.  In  the  Fourier  representation,  the  operator  (8.19)  acts  as  multiplication  by  the 
inverse  of  the  polynomial  in  k 2,  and  the  positive-definiteness  property  translates  into 
the  requirement  that  the  spectral  polynomial 

J 

B-l{k2)  =  l  +  Yjaj(-k2y  (8.20) 

j=  1 

to  be  positive  for  all  k2  >  0.  This  constraint  is  equivalent  to  the  statement  that  the 
rhs  of  (8.20)  must  not  have  real  positive  roots.  Therefore,  B~l(k2)  can  also  be  rep¬ 
resented  in  the  form 


B 


j 

E< 


,D' 


i  M 

B_1(^2)  =  7  n  {k2+z2m)[k2+z2m),  (8.21) 

Z  m=  1 

where  M  =  7/2, 

z=rKi2>  <8-22) 

m 

the  overline  denotes  the  complex  conjugate,  and  zm  =  +  ibm  are  arbitrary  complex 

numbers  with  1  m (z2n )  ^  0.  In  its  general  form,  the  polynomial  (8.21)  is  additionally 
multiplied  by  the  product  of  the  arbitrary  number  of  real  negative  roots  (bm  =  0). 
The  ensuing  analysis  of  (8.21)  will  be  simplified  by  omitting  the  product  (summa¬ 
tion)  limits  over  m  and  assuming  there  are  no  real  negative  or  multiple  roots.  The 
latter  requirement  is  not  restrictive  in  practice,  because  location  of  the  roots  is  never 
known  exactly,  and  the  BEC  spectrum  can  always  be  well  approximated  by  (8.21) 
[21]. 

It  is  instructive  to  note  that  the  polynomial  (8.21)  can  also  be  rewritten  as 

i  M 

B-\k2)  =  7  IK ai+(k-bm)2)(a2m  +  (k  +  bm)2),  (8.23) 

Z  m=  1 

Compared  to  the  spectral  representation  (8.20),  representation  (8.23)  has  the  advan¬ 
tage  that  its  free  parameters  are  not  constrained  by  the  positive-definiteness  require- 
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ment,  and  they  have  a  sensible  meaning  of  the  scales  (b  1 )  and  ’’energies”  (a-1)  of 
the  modes  forming  the  spectrum. 

Using  equation  (8.6),  the  matrix  elements  of  B  can  now  be  written  as 


pn(,  Zr-S  7  k'+'js(kr)dk 

(2n)d  FI (k2+z2m){k2+z}nY 

0  m 

where  s  =  n/ 2—1.  The  integral  in  (8.24)  can  be  taken  by  decomposing 

Z 

m  =  n  (*2+4)(*2+4) 

m 


into  elementary  fractions: 


m 


Qm  Qm 

k2+zi  k2+z2my 


where 


Q  m  — 


(4  -  4)  n  (4  -  z2)  (4  -  z)) 

j¥=m 


(8.24) 


(8.25) 


(8.26) 


(8.27) 


After  substitution  of  (8.26)  into  (8.24),  the  integral  is  reduced  to  the  sum  of 
Hankel-Nicholson  type  integrals  [22]  and  can  be  taken  explicitly,  yielding 


(2  ny  m 

(8.28) 

where  pm  =  zmr,  and  angular  brackets  denote  taking  the  real  part  (cf.  eq.  (8.13)). 

The  corresponding  correlation  functions  C"  (r)  are  obtained  through  normalizing 
(8.28)  by  Bn( 0).  The  first  three  values  at  r  =  0  are 

Bl{0)  =  Y^^lmZm)\Zm\  2 

m 

(8.29) 

B2{  0)  =--Y{qm\0gZm) 

(8.30) 

0)  =  — 

^  m 

(8.31) 

The  normalization  factors  can  be  found  by  integrating  C"(r)  over  M": 

-  y  (‘ 7»i^m ) 

B"(0)L  |z„,|4 


N' 


(8.32) 
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Relationships  (8.28)-(8.32)  provide  analytical  expressions  for  the  CFs  and  the 
normalization  factors. 


Fig.  8.2  Two-parameter  CFs  corresponding  to  the  inverse  BEC  (8.21)  with  a  =  1  ,M  =  1.  The 
horizontal  axis  is  scaled  by  a.  Dotted  lines  show  CFs  corresponding  to  the  special  case  with  two 
negative  roots  k\  =  —a,  k\  =  —b  not  described  by  the  spectral  polynomial  (8.21). 


In  the  important  case  of  the  quadratic  polynomial  ( M  =1)  the  BEC  model  is 
defined  by  two  parameters  a,b  (Figure  2).  Expressions  for  the  respective  CFs  in 
1-  and  3-dimensional  cases  can  be  rewritten  in  terms  of  the  elementary  functions 
[9,21] 


,  \J  a2  +b2  a , 

C  ( a,b,r )  = - exp(— ar)  cos {br  —  arctan-) 

b  b 


C3(a,b,r)  =  exp(— ar) 
and  the  normalization  factors  are  given  by 
4  a  o  87 lab 


sin  (br) 
br 


N  =  ■ 


i2  +b2’ 


N2  = 


2{a2  +  b2)2arctan(Z?/a)  ’ 


N3  = 


87 la 


(a2  +b2)2 


(8.33) 

(8.34) 


(8.35) 


In  practical  applications,  a  BEC  model  is  often  constructed  by  fitting  the  spec¬ 
tral  (8.25)  or  correlation  (8.28)  functions  to  those  derived  from  experimental  data. 
These  functions  are  characterized  by  2m  parameters  which  give  enough  freedom  for 
approximating  complex  spectra.  The  approximation  procedure  can  be  formulated  as 
a  least  squares  problem  in  2m  dimensions,  which  may  be  rather  difficult  to  solve 
due  to  the  non-linearity  of  B  with  respect  to  the  fitting  parameters  am  and  bm.  There¬ 
fore,  it  is  useful  to  have  guidance  on  how  the  BEC  model  parameters  are  related  to 
the  scales  and  amplitudes  of  the  physical  modes  that  contribute  to  the  experimental 
spectrum  (Fig.  3). 

The  contribution  of  the  mth  mode  to  the  spectrum  can  be  assessed  by  integrating 
the  right  hand  side  of  (8.26): 
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Qm 

k2+zi 


Rm 

k2+Zfn_ 


_ 7Z  (tf  mZm) 

\Zm\2 


(8.36) 


In  the  limit  when  distances  \bi  —  bm\  between  the  spectral  peaks  of  B  are  much 
larger  than  their  half-widths  am,  (i.e.  am/bm  <C  0  in  particular),  equation  (8.36)  can 
be  simplified  using  the  asymptotic  approximations 

Zm-ibm;  nm=l\(l-b2Jb )f 


to  yield 


nbi 


Asymptotic  values  of  the  spectral  density  at  the  peaks  are  respectively 


B(bm) 


Aa~nYlm  zia,n 


(8.37) 


(8.38) 


i.e.  the  peak  amplitudes  are  inversely  proportional  to  b2n  and  to  the  square  of  the 
mode  scale  a Expressions  (8.36)-(8.38)canbe  useful  in  generating  the  first  guess 
values  for  zm  to  initialize  an  iterative  procedure  of  approximating  experimental  data. 


Fig.  8.3  An  example  of  the  normalized  spectrum  (left)  and  the  respective  correlation  function 
(right)  for  the  fourth-order  polynomial  (8.26)  in  two  dimensions.  (M  =  2;  z\  =  .5  +  3 Z2  = 
.2  +  61). 

After  the  model  parameters  are  established,  the  action  of  B  1  can  be  computed 
recursively  (cf.  equations  (8.21)-(8.22): 

B-^nP-i^r2^)'-0)] 


(8.39) 
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The  inverse  BEC  model  (8.39)  can  then  then  be  employed  to  compute  either  the 
action  of  B  with  an  iterative  inversion  algorithm  or  to  directly  compute  the  gradient 
of  a  3dVar  cost  function  involving  the  quadratic  form  XTB  'x,  where  X  is  the  state 
vector. 

The  above  analysis  gives  an  insight  on  the  shape  of  the  local  CFs  and  provides  a 
direct  connection  between  the  scales  described  by  B  and  the  polynomial  coefficients 
of  the  considered  BEC  models  (8.9),  (8.10),  (8.25)  or  (8.39).  The  second  important 
ingredient  in  constructing  the  BEC  operator  C  (eq.  (8.3))  is  estimating  the  diagonal 
elements  of  B,  which  is  a  more  technical  but  equally  important  problem. 


8.3  Diagonal  estimation 
8.3.1  Stochastic  methods 

In  the  last  few  decades  a  large  family  of  stochastic  algorithms  were  developed  for 
estimating  elements  and  traces  of  extra-large  matrices  emerging  from  numerical 
soluitons  of  the  PDEs  in  applied  physics  (e.g.,  [23,  24,  25]).  Weaver  and  Courtier 
[27]  were  among  the  first  to  use  this  approach  in  geophysical  applications  for  esti¬ 
mating  the  diagonal  of  the  operator  (8.9). 

The  underlying  idea  is  to  define  an  ensemble  of  K  random  vectors  S/,  on  the 
model  grid  and  perform  componentwise  averaging  of  the  products  S  =  Bs  according 
to  the  formula: 

d(x)  =  S0S0S0S,  (8.40) 

where  the  overline  denotes  averaging  over  the  ensemble  and  ©,  0  stand  for  the 
componentwise  multiplication  and  division  of  the  vectors  respectively.  Simple  con¬ 
siderations  show  that  when  all  the  components  of  S  have  identical  5-correlated  dis¬ 
tributions  with  zero  mean,  the  contributions  to  d  from  the  off-diagonal  elements 
tend  to  cancel  out,  and  d  converges  to  d  =  diagB  as  K  — >  °°.  More  accurately,  the 
squared  relative  approximation  error 

e2{x)  =  (d-d)2/d2  (8.41) 

is  inversely  proportional  to  the  ensemble  size  K.  In  other  words,  one  may  expect  to 
achieve  10%  accuracy  at  the  expense  of  approximately  100  multiplications  by  B  if 
the  first  ensemble  member  gives  a  100%  error.  This  estimate  may  seem  acceptable 
since  in  geophysical  applications  the  BE  variances  are  usually  known  with  limited 
precision  and  approximating  the  diagonal  with  5-10%  error  seems  satisfactory. 

The  above  described  Monte-Carlo  (MC)  technique  was  developed  further  by 
Bekas  et  al  [26],  who  noticed  that  the  method  may  converge  to  d  in  the  finite  number 
of  iterations  that  equals  to  the  matrix  dimension  N  if  the  ensemble  vectors  are  mu¬ 
tually  orthogonal.  An  easy  way  to  construct  such  an  ensemble  is  to  draw  the  vectors 
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S/v  from  the  columns  of  the  N/N  Hadamard  matrix  (HM),  which  span  the  model’s 
state  space  (see  Appendix  3). 

In  the  numerical  experiments  below  we  use  MC  and  HM  techniques  as  testbeds 
for  the  diagonal  estimation  methods  which  can  be  derived  from  analytical  consider¬ 
ations  and  are  exploit  prior  knowledge  of  the  structure  of  B. 


8.3.2  Locally  homogeneous  approximations 


Consider  homogeneous  (v  =  const)  operators  (8.2)  with  a2  =  1/2  and  assume  that 
the  coordinate  axes  are  aligned  along  the  eigenvectors  of  the  diffusion  tensor,  whose 
(positive)  eigenvalues  are  X2 . i  =  ]....«.  Then  the  matrix  elements  of  B,,,„  can  be 
written  down  explicitly  as 


=  exp(D/2)  =  d  exp  — 
Bm(x,y)  =  (\-D/2m)-m=d^^ 


(8.42) 

(8.43) 


where 

P  =  l/(*-J’)Tv-1(*-j’) 

is  the  distance  between  the  correlated  points  (measured  in  terms  of  the  smoothing 
scales  Xi),  d  =  (27t)~n^2£2^1  are  the  (constant)  diagonal  elements  of  B?  m,  £2  = 
nXj  =  Vdetv  is  the  diffusion  volume  element,  and  p  =  \/2mp. 

When  v  varies  in  space,  (8.42-8.43)  are  no  longer  valid,  and  the  diagonal  ele¬ 
ments  d  depend  on  x  and  the  type  of  the  B  operator.  However,  if  we  assume  that  v 
is  locally  homogeneous  (LH),  i.e.  varies  in  space  on  a  typical  scale  L  which  is  much 
larger  than  X the  diagonal  elements  d(jt)  can  be  expanded  in  the  powers  of  the 
small  parameter  £  =  X/L,  where  X  is  the  mean  eigenvalue  of  >Jv.  The  zeroth-order 
LH  approximation  term  (LHO)  is  apparently 

d°(jc)  =  {2n)~nl2£2{xyx  (8.44) 


because  for  infinitely  slow  variations  of  v  (L  — >  °°),  the  normalization  factors  must 
converge  to  the  above  expression  for  the  constant  diagonal  elements  d.  It  is  note¬ 
worthy  that  the  formula  (8.44)  is  found  to  be  useful  even  in  the  case  of  strong  inho¬ 
mogeneity  £  >  1.  In  particular,  numerical  experiments  of  Mirouze  and  Weaver  [17] 
have  shown  that  such  an  approximation  provided  10%  errors  in  a  simplified  Id  case. 

The  accuracy  of  (8.44)  can  formally  be  increased  by  considering  the  next  term  in 
the  expansion  of  the  diagonal  elements  of  B^  m.  The  technique  of  such  asymptotics 
has  been  well  developed  for  the  diagonal  of  the  Gaussian  kernel  (8.42)  in  Rieman- 
nian  spaces  (e.g.,  [15,  16]).  More  recently,  the  approach  was  considered  by  Purser 
[13,  14]  in  the  atmospheric  data  assimilation  context.  The  application  of  this  tech- 
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nique  to  the  diffusion  operator  (8.1)  in  flat  space  yields  the  following  asymptotic 
expression  for  the  diagonal  elements  of  B,,  in  the  local  coordinate  system  where 
v(x)  is  equal  to  the  identity  matrix,  and  D  takes  the  form  of  the  Laplacian  operator: 


Bg(*,x)  = 


(27t)”/2 


1-itr *-l 
2  12 


-tr/i  +  V  •  div/i 


-0(e5) 


(8.45) 


Here  li  is  a  small  (\h\  ~  e)  correction  to  v  within  the  vicinity  of  x.  Note  that  the 
terms  in  the  parentheses  have  the  order  0(e3),  because  each  spatial  differentiation 
adds  an  extra  power  of  e. 

The  asymptotic  estimate  (8.45)  involves  second  derivatives  which  tend  to  amplify 
errors  in  practical  applications  when  e  may  not  be  small.  Therefore,  using  (8.45)  in 
its  original  form  could  be  inaccurate  even  at  a  moderately  small  value  of  e.  To 
increase  the  computational  efficiency,  it  is  also  desirable  to  formulate  the  first-order 
approximation  as  a  linear  operator,  which  acts  on  d°(x).  Keeping  in  mind  that  \h  \  ~ 
e,  and  utilizing  the  relationships: 


d°(jc)  =  (2n)~n/2£2(x)~] 


(2? zyni2 


(8.46) 


exp(4/2)«l  +  i4,  (8.47) 

the  second  term  in  the  parentheses  of  (8.45)  can  be  represented  as  follows: 

V-  di  vh  =  -Atrh  +  V-  div/i'  (8.48) 

n 

where  li'  is  the  traceless  part  of  It.  On  the  other  hand,  if  the  divergence  of  li'  is 
neglected,  the  equation  (8.45)  can  be  rewritten  in  the  form 

(* +r4)  0  -  f''')  <8-49) 

where 

7«  —  7  +  •  (8-50) 

6  3  n 

Taking  (8.46-8.47)  into  account  and  replacing  A  by  D,  the  desired  ansatz  for  the 
first-order  approximation  (LH1)  of  the  diagonal  elements  is  obtained: 

d*=exp^y„|0d°  (8.51) 


The  relationship  (8.51)  was  derived  by  Purser  et  al.  [12]  for  the  one-dimensional 
case  (yi  =  0.5)  and  tested  by  Mirouze  and  Weaver  [17],  who  reported  a  significant 
(2-4  times)  improvement  of  the  accuracy  in  Id  simulations. 
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An  estimate  similar  to  (8.51)  can  also  be  obtained  for  Bm,  possibly  with  a  dif¬ 
ferent  coefficient  y„.  It  is  assumed,  however,  that  y,  may  not  differ  too  much  from 
y„  given  similarity  in  the  shapes  (Fig.  1)  of  the  correlation  functions  (8.42)-(8.43). 
Furthermore,  because  of  the  approximate  nature  of  (8.51),  the  best  representation  of 
d(jc)  in  realistic  applications  may  be  achieved  with  a  value  of  y„  that  ts  significantly 
different  from  the  one  given  by  (8.50).  For  this  reason,  a  more  general  form  of  (8.51) 
was  adopted  in  the  numerical  experiments,  assuming 

dg(x)  ~  exp[yD/2]d°(x);  dj(x)  «  [I  -  yD/4]~2d^(x)  (8.52) 

for  the  Gaussian  model  and  its  second-order  (m  =  2)  spline  approximation  (10). 

The  following  experiments  investigate  the  dependence  of  the  respective  approx¬ 
imation  errors  (£02)  °n  the  free  parameter  y. 


Fig.  8.4  Left:  A  composite  map  of  five  columns  of  the  Be  operator.  White  circles  denote  locations 
of  the  diagonal  elements  of  the  corresponding  correlation  matrices.  Right  panel  shows  the  map  of 
the  non-normalized  diagonal  elements  of  B? .  Depth  contours  are  in  meters. 


8.3.3  Numerical  results 

To  assess  the  efficiency  of  the  methods  outlined  in  Sections  3. 1-3.2,  two  series  of  nu¬ 
merical  experiments  with  realistically  inhomogeneous  BEC  models  are  performed. 
In  the  first  series  the  methods  were  tested  in  the  2d  case  with  the  state  vector  having 
a  dimension  of  several  thousand.  In  the  second  series,  the  LHO  and  LH1  techniques 
are  examined  in  a  realistic  3d  setting  with  a  state  space  dimension  of  N  ~  106. 
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8.3.3.1  Experimental  setting  in  2d 

The  state  space  is  described  by  scalar  functions  defined  on  the  orthogonal  curvilin¬ 
ear  grid  of  the  Navy  Coastal  Ocean  Model  (NCOM)  [28]  set  up  in  the  Monterrey 
Bay  (Fig.  4).  The  number  N  of  grid  points  (dimension  of  the  state  space)  was  3,438. 
A  vector  field  u(x)  was  used  to  generate  the  diffusion  tensor  as  follows.  The  smaller 
principal  axis  A2  of  \Jv  is  set  to  be  orthogonal  to  u  with  the  corresponding  ’’back¬ 
ground”  length  scale  A2  =  35,  where  5 (x )  is  the  spatially  varying  grid  step.  The 
length  of  the  larger  axis  Ai  is  set  to  be  equal  to  max(l,  -y/ |m | / m) A2,  where  u  is  a 
prescribed  threshold  value  of  |u|.  If  u  is  a  velocity  field,  then  a  structure  like  this 
simulates  enhanced  diffusive  transport  of  model  errors  in  the  regions  of  strong  cur¬ 
rents  on  the  background  of  isotropic  error  diffusion  with  the  decorrelation  scale  A2. 

In  the  2d  experiments,  the  vector  field  u  is  generated  by  treating  bottom  topog¬ 
raphy  h{ x)  (Fig.  4)  as  a  stream  function.  The  threshold  value  v  was  taken  to  be 
one-fifth  of  the  rms  variation  of  |V/z|  over  the  domain. 

All  the  experiments  described  in  the  next  two  sections  are  performed  using  the 
BEC  models  (8.42-8.43)  with  the  parameters  n  =  m  =  2.  A  composite  map  of  five 
columns  of  B?  is  shown  in  Figure  4a.  The  diffusion  operator  (1)  is  constrained  to 
have  zero  normal  derivative  at  the  open  and  rigid  boundaries  of  the  domain  in  both 
2d  and  3d  experiments. 

Numerically,  the  action  of  Bg  on  a  state  vector  y0  was  evaluated  by  explicitly 
integrating  the  corresponding  diffusion  equation  yr  =  D /2y  for  the  virtual  ’’time  pe¬ 
riod”  defined  by  v,  starting  from  the  ’’initial  condition”  y0.  The  minimum  number  of 
’’time  steps”  required  for  the  scheme’s  stability  in  such  a  setting  was  5,256.  The  ac¬ 
tion  of  B2  was  computed  by  solving  the  system  of  equations  (I  —  D/4)2y  =  y0  with 
a  conjugate  gradient  method.  The  number  of  iterations,  required  for  obtaining  a  so¬ 
lution,  varied  within  2,000-2,500.  To  make  the  shapes  of  the  B„  and  B2  compatible 
(Fig.  1),  the  diffusion  tensor  in  B2  was  multiplied  by  8 / 7T  (see  Table  1). 

The  exact  values  d(x)  of  the  diagonal  elements  are  shown  in  Figure  4b.  Their 
magnitude  appears  to  be  lower  in  the  regions  of  ’’strong  currents”  (large  u),  as  the 
corresponding  5-functions  are  dispersed  over  larger  areas  by  diffusion.  d(x)  are 
higher  near  the  boundaries  because  part  of  the  domain  available  for  dispersion  is 
screened  by  the  condition  that  prescribes  zero  flux  across  either  open  or  rigid  bound¬ 
aries. 


8.3.3.2  Statistical  methods 

The  MC  method  is  implemented  in  two  ways:  In  the  first  series  of  experiments,  the 
components  of  Sk  are  taken  to  be  either  1  or  -1  with  equal  probability.  In  the  second 
series  they  are  drawn  from  the  white  noise  on  the  interval  [-1,  1].  The  residual  error 
e  is  computed  using  equation  (8.41).  In  both  series,  the  rates  of  reduction  of  e  with 
iteration  k  are  similar  and  closely  follow  the  \/k  law  (upper  gray  line  in  Fig.  5a). 
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To  improve  the  accuracy,  the  MC  estimates  are  low-pass  filtered  with  the  corre¬ 
sponding  B-operators  at  every  iteration  (Fig.  5b).  To  optimize  the  filter,  the  diffu¬ 
sion  operators  in  B¥ ,2  are  multiplied  by  the  tunable  parameter  y,  which  effectively 
reduced  the  mean  decorrelation  (smoothing)  scale  y  1  /2  times.  The  lower  lines  in 
Figure  5a  demonstrate  the  result  of  such  optimal  smoothing:  this  procedure  resulted 
in  an  almost  four-fold  reduction  of  the  domain-averaged  error  (e)  to  0.1  after  per¬ 
forming  60  iterations  (averaging  over  60  ensemble  members). 


Fig.  8.5  a)  reduction  of  the  domain-averaged  diagonal  estimation  error  (e)  with  iterations  for  th 
HM  (black)  and  MC  (gray)  methods  for  the  Bi  model.  The  lower  curves  are  obtained  after  optimal 
smoothing  of  the  estimates.  The  thin  horizontal  lines  show  the  error  levels  that  are  provided  by  the 
asymptotic  zeroth-  ((e)=0.17)  and  first-order  ((e)=0. 10)  methods  which  do  not  require  iterative 
schemes,  b)  Horizontal  distribution  of  e(B2)  after  60  iterations  of  the  HM  method  with  smoothing. 


Experiments  with  the  HM  method  (black  curves  in  Fig  5a)  show  that  horizontal 
smoothing  significantly  improves  the  accuracy  of  the  estimates,  especially  after  the 
first  few  dozens  of  iterations.  Comparison  with  the  MC  method  (gray  curves  in  Fig. 
5a)  demonstrates  a  noticeable  advantage  of  the  HM  technique  (black  curves),  which 
remains  visible  at  higher  iterations  k  >  100  even  after  smoothing  (lower  curves). 
This  advantage  increases  with  increasing  iterations  for  two  reasons:  The  HM  method 
converges  faster  than  k  1/2  by  its  nature,  whereas  the  efficiency  of  smoothing  (tar¬ 
geted  at  removing  the  small-scale  error  constituents)  degrades  as  the  signal-to-noise 
ratio  of  the  diagonal  estimates  increases  with  the  iteration  number  k. 

From  the  practical  point  of  view,  it  is  not  reasonable  to  do  more  than  several  hun¬ 
dred  iterations,  as  (e)  drops  to  the  value  of  a  few  per  cent  (Fig.  5a),  which  is  much 
smaller  than  the  accuracy  in  the  determination  of  the  background  error  variances.  It 
can  therefore  be  concluded  that  it  is  advantageous  to  use  the  HM  technique,  when 
making  more  than  a  hundred  iterations  is  computationally  affordable. 
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8.3.3.3  Asymptotic  expansion  method 

Since  the  principal  axes  of  the  diffusion  tensor  at  every  point  are  defined  by  con¬ 
struction,  computation  of  the  zeroth-order  approximation  (8.44)  to  the  normaliza¬ 
tion  factors  is  not  expensive.  Near  the  boundaries,  however,  the  factors  described  by 
(8.44)  have  to  be  adjusted  by  taking  into  account  the  geometric  constraints  imposed 
on  the  diffusion.  This  adjustment  was  computed  for  points  located  closer  than  3A] 
from  the  boundary  and  it  was  assumed  that  the  boundary  had  negligible  impact  on 
the  shape  of  the  diffused  5-function  [18]. 

Figure  6  demonstrates  the  horizontal  distribution  of  the  error  e(jc)  obtained  by 
approximating  the  diagonal  elements  of  B„  with  (8.44)  (zeroth-order  LH  method, 
or  LHO)  and  with  (8.51),  (the  first-order  LH  method  LH1).  Despite  an  apparent 
violation  of  the  LH  assumption  in  many  regions  (e.g.,  Ai  changes  from  205  to  the 
background  value  of  35  at  distances  L  ~  5  —  65  <  Ai  across  the  shelf  break),  the 
mean  approximation  error  of  the  diagonal  elements  appears  to  be  relatively  small 
(19%)  for  the  LHO  method,  with  most  of  the  maxima  confined  to  the  regions  of 
strong  inhomogeneity  (Fig.  6a).  The  next  approximation  (Fig.  6b)  reduces  (e)  to 
9%.  Numerical  experiments  with  the  B2  model  have  shown  similar  results  (16% 
and  10%  errors). 
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Fig.  8.6  Diagonal  approximation  errors  under  the  zeroth-order  (a),  and  first-order  (b)  LH  methods 
for  the  Bs  model.  The  thin  black  line  inside  the  boundaries  shows  the  domain  of  error  averaging. 


Another  series  of  experiments  are  performed  with  the  varying  scaling  parame¬ 
ter  y  to  find  an  optimal  fit  to  d.  Computations  were  made  for  0  <  y  <  1.  The  best 
result  for  B„  was  obtained  for  y2  =  0.30  which  is  fairly  consistent  with  the  value 
(y2  =  0.33)  given  by  (8.50).  In  the  case  of  the  B2  operator,  the  optimal  value  is 
y2  =  0.24,  still  in  a  reasonable  agreement  with  (8.50),  given  the  strong  inhomo¬ 
geneity  of  v  and  deviation  of  the  B2  operator  from  the  Gaussian  form.  A  somewhat 
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smaller  value  of  72  (Bi)  can  be  explained  by  the  sharper  shape  of  the  respective  cor¬ 
relation  function  at  the  origin  (Fig.  1),  which  renders  d°  to  be  less  dependent  on  the 
inhomogeneities  in  the  distribution  of  v,  and,  therefore,  requires  less  smoothing  in 
the  next  approximation. 


8.3.4  Numerical  efficiency 


Table  2  provides  an  overview  of  the  performance  for  the  tested  methods.  For  com¬ 
parison  purposes  we  show  CPU  requirements  by  the  smoothed  MC  and  HM  meth¬ 
ods  after  they  achieve  the  accuracies  of  the  LHO  and  LH1  methods.  It  is  seen  that 
both  MC  and  HM  methods  are  300-1000  times  more  computationally  expensive 
than  the  LH  technique.  In  fact,  for  the  2d  case  considered,  the  computational  cost  of 
the  stochastic  diagonal  estimation  method  is  similar  to  the  cost  of  the  3dvar  analysis 
itself,  which  required  several  hundred  iterations.  The  remarkable  CPU  saving  are 
due  to  the  fact  that  the  LH  methods  explicitly  take  into  account  information  on  the 
local  structure  of  B  which  can  be  derived  by  analytical  methods.  Comparison  of  the 

Table  8.2  Relative  CPU  times  required  by  the  MC  and  HM  methods  to  achieve  the  accuracies  (e) 
of  the  LHO  and  LH1  methods  (shown  in  brackets). 


MC/LH0 

MC/LH1 

HM/LH0 

HM/LH1 

B* 

755 

(■19) 

1205 

(.09) 

680 

(.19) 

520 

(.09) 

b2 

780 

(.17) 

490 

(.10) 

850 

(.17) 

330 

(.10) 

spatial  distributions  of  the  approximation  error  (e)(x)  (Fig.  5b,  6b)  favor  the  LH 
methods  as  well:  They  show  significantly  less  small-scale  variations  and  may  have 
a  potential  for  further  improvement.  Comparing  Fig.  5b  and  6b  also  shows  that, 
in  contrast  to  the  statistical  methods,  LHO  errors  tend  to  increase  in  the  regions  of 
strong  inhomogeneity,  but  decrease  substantially  after  smoothing  by  the  LH1  algo¬ 
rithm.  At  the  same  time,  the  LH1  errors  tend  to  have  relatively  higher  values  near  the 
boundaries.  The  effect  is  less  visible  in  the  HM  pattern  (Fig.  5b).  This  feature  can  be 
partly  attributed  to  certain  inaccuracy  in  estimation  of  the  near-boundary  elements. 
However,  there  is  certainly  room  for  further  improvement  with  the  issue. 
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8.3.5  LH  experiments  in  3d  setting 

To  check  the  performance  of  the  LHO  and  LH1  methods  further,  a  larger  3d  NCOM 
domain  is  set  up  in  the  Okinawa  region  (Fig.  7)  with  horizontal  resolution  of  10  km 
and  45  vertical  levels.  The  state  vector  dimension  N  (total  number  of  the  grid  points) 
in  this  setting  was  862,992. 

Because  of  the  large  N,  it  is  computationally  unfeasible  to  directly  compute  all 
the  diagonal  elements  of  the  BEC  matrix.  Therefore,  accuracy  checks  are  performed 
on  a  subset  of  10,000  points,  randomly  distributed  over  the  domain  and  the  value  of 
(e)  is  estimated  by  averaging  over  these  points. 


Fig.  8.7  Diagonal  elements  of  Bx  in  the  Okinawa  region  at  z  =  20 m.  The  actual  values  are  multi¬ 
plied  by  104. 


The  diffusion  tensor  is  constructed  in  the  same  way  that  is  described  in  Section 
3.1,  but  the  generating  field  u(x)  is  taken  to  be  the  horizontal  velocity  field  from  an 
NCOM  run.  The  value  of  A3  (in  the  vertical  direction)  is  independent  of  horizontal 
coordinates,  but  varies  in  the  vertical  as  3 Sz,  where  <S-  is  the  vertical  grid  step.  Figure 
7  illustrates  spatial  variability  of  the  Bs  diagonal  elements  at  3=  20  m.  The  smallest 
values  are  observed  in  the  regions  of  the  Kuroshio  and  the  North  Equatorial  Current, 
where  the  largest  velocities  are  observed,  and  the  Q  =  \J detv  reaches  its  largest 
values  (eq.  8.44).  To  better  test  the  algorithm,  a  relatively  small  threshold  value  of 
i  =.02  m/s  is  prescribed,  so  that  diffusion  is  anisotropic  in  more  than  90%  of  the  grid 
points. 
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Figure  8  demonstrates  the  accuracy  of  LHO  and  LH1  methods  in  such  setting: 
the  LHO  method  provides  an  accuracy  of  9%  which  is  further  improved  to  6%  by 
the  LH1  scheme.  The  major  improvement  occurs  in  the  regions  where  points  with 
highly  anisotropic  v  neighbor  isotropic  points  and  reduce  the  diagonal  elements  in 
the  latter.  The  effect  is  reflected  by  the  negative  bias  of  the  scatter  plot  at  high  values 
of  d°,  which  reaches  its  maximum  of  .0237  in  the  points  with  isotropic  v  (Fig.  8a). 


Fig.  8.8  Scatter  plots  of  the  true  diagonal  elements  of  Bs  (vertical  axis)  versus  their  approxima¬ 
tions  by  LHO  (a)  and  LH1  (b)  algorithms.  The  actual  values  are  multiplied  by  103.  Near-boundary 
points  are  excluded.  Right:  Diagonal  approximation  errors  as  a  function  of  y  for  the  Bj,  (black)  and 
B2  (gray)  models.  Dashed  line  shows  the  value  of  given  by  (8.50). 


Figure  8c  shows  the  dependence  of  approximation  error  e  on  the  value  of  75  for 
both  correlation  models.  The  best  approximation  is  obtained  at  73  =  0.26,  a  value 
somewhat  lower  than  suggested  by  the  heuristic  formula  (73=5/18=0.28,  dashed 
line).  Similarly  to  the  2d  case,  the  optimal  value  of  73 (Bi)  =  0.21  is  less  than 
which  is  in  agreement  with  the  more  rapid  off-diagonal  decay  of  the  B? 
matrix  elements. 

In  general,  it  appears  that  the  relationship  (8.50)  provides  a  reasonable  guidance 
to  the  estimation  of  the  smoothing  parameter  in  the  LH1  method.  For  the  Bg  model, 
the  operator  acting  on  d!j  can  be  implemented  by  either  reducing  the  number  of 
“time  steps”  in  integration  of  the  diffusion  equation  7  1  times,  or  by  7  '/2-fold 
reduction  of  the  decorrelation  radius.  For  the  Bi  model  only  the  second  option  is 
applicable:  it  also  reduces  the  number  of  iterations  required  for  computing  the  action 
of  the  Bi  due  to  the  decrease  of  the  condition  number. 


8.4  Summary  and  discussion 

BEC  modeling  with  the  diffusion  operator  is  an  efficient  and  flexible  tool  for  eval¬ 
uating  matrix-vector  products  of  large  dimension  which  emerge  in  minimization 
algorithms  of  variational  data  assimilation.  In  this  chapter,  we  discussed  two  ma- 
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jor  issues  associated  with  this  type  of  models:  construction  of  a  positive-definte 
smoothing  operator  B  as  a  rational  function  of  D  and  the  estimation  of  diagB. 

In  section  2  analytic  relationships  between  the  polynomial  coefficients  of  B  and 
the  parameters  controlling  the  shape  of  correlation  functions  were  derived.  Although 
only  homogeneous  operators  in  boundless  domains  were  considered,  these  relation¬ 
ships  provide  reasonable  guidance  to  constructing  more  realistic  BEC  operators, 
especially  in  cases  when  the  typical  scale  of  variability  of  the  diffusion  tensor  is 
much  larger  than  the  local  decorrelation  scale  pc  and/or  most  of  the  observations 
are  separated  from  the  boundaries  by  distances,  exceeding  pc.  In  a  similar  way, 
weak  inhomogeneity  can  be  introduced  by  variable  coefficients  zm{x),  and  the  local 
CF  shapes  can  be  assessed  using  (8.13),  (8.28-8.31). 

Similar  issues  have  been  recently  studied  by  many  authors  (e.g.,  [8,  10,  11, 
17]).  In  particular,  analytic  formulas  analogous  to  (8.33)-(8.35),  were  derived  in 
somewhat  different  setting  by  Hristopulos  and  Elogne  [10,  11]  whom  considered 
quadratic  polynomials  of  similar  structure.  Xu  [8]  analyzed  Taylor  expansions  of 
expB  and  obtained  recursive  relations  for  the  polynomial  coefficients  associated 
with  an  arbitrary  CF.  Mirouze  and  Weaver  [17]  demonstrated  a  possibility  to  gener¬ 
ate  oscillating  CFs  using  higher-order  polynomials  in  one  dimension. 

Relationships  (8.28-8.32)  generalize  these  results  for  the  polynomial  model  of  an 
arbitrary  order  M.  We  assume,  however,  that  the  inverse  quadratic  model  (M  =  1)  is 
of  major  practical  interest  for  two  reasons.  First,  the  BEC  operators  that  are  encoun¬ 
tered  in  GFD  applications  are  rarely  homogeneous  and  observational  statistics  are 
usually  insufficient  to  capture  the  details  of  the  spatial  variability  of  the  CFs.  There¬ 
fore,  experimental  estimates  of  the  BECs  are  either  limited  to  low-rank  ensemble 
estimates  or  have  to  rely  on  the  very  rough  assumption  of  homogeneity.  Needless 
to  say,  that  in  the  latter  case  the  structure  of  a  sample  CF  should  be  elaborated  with 
sufficiently  low  detailization  and  be  well  accounted  for  by  a  two-parameter  BEC 
model  (Fig.  2) .  The  second  reason  is  that  the  use  of  higher-order  polynomials  con¬ 
siderably  degrades  the  conditioning  of  the  linear  systems  that  are  being  solved  in 
the  assimilation  process  and,  therefore,  may  require  sophisticated  preconditioners. 

The  second  equally  important  aspect  of  the  d-operator  BEC  modeling  is  the  com¬ 
putational  efficiency  of  estimating  the  diagonal  elements  of  B.  Two  types  of  the  BEC 
operators  were  considered:  with  the  Gaussian-shaped  kernel  B„  and  with  the  kernel 
generated  by  the  second-order  binomial  approximation  to  Bg.  The  tested  techniques 
include  the  ’’stochastic”  MC  and  HM  methods,  which  retrieve  diagB  iteratively  from 
its  action  on  a  sequence  of  model  state  vectors,  and  the  ’’deterministic”  scheme 
based  on  the  analytic  diagonal  expansion  under  the  assumption  of  local  homogene¬ 
ity  of  the  diffusion  tensor.  The  deterministic  scheme  was  tested  in  two  regimes:  the 
zeroth  (LHO)  and  the  first-order  (LH1)  approximations. 

Numerical  experiments  conducted  with  realistic  diffusion  tensor  models  show 
that:  a)  the  HM  technique  proves  to  be  superior  in  efficiency  compared  to  the  MC 
technique  when  accuracies  of  less  than  10%  error  ( k  >  100)  are  required;  b)  both 
stochastic  methods  require  300-1000  times  more  CPU  time  to  achieve  the  accuracy, 
compatible  with  the  most  efficient  LH1  method;  c)  with  the  Gaussian  model,  the 
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LH1  method  demonstrates  the  best  performance  with  the  value  of  the  smoothing  pa¬ 
rameter  7  compatible  with  the  one  given  by  the  relationship  (8.50)  derived  from  the 
asymptotic  approximation  of  the  Gaussian  kernel  diagonal.  In  deriving  the  ansatz 
(8.51)  for  the  LH1  model,  we  followed  the  approach  of  Purser  et  al  [12],  whom 
proposed  to  smooth  the  zeroth-order  diagonal  by  the  square-root  of  the  BEC  opera¬ 
tor  in  the  one-dimensional  case.  Using  the  asymptotic  technique  for  the  heat  kernel 
expansion,  we  obtained  a  formula  for  higher  dimensions,  and  tested  its  validity  by 
numerical  experimentation. 

It  should  be  noted  that  the  formal  asymptotic  expansion  (8.45)  is  local  by  nature 
and  tends  to  diverge  in  practical  applications,  where  spatial  variations  of  the  diffu¬ 
sion  tensor  may  occur  at  distances  L  comparable  with  the  typical  decorrelation  scale 
A.  To  effectively  immunize  the  expansion  from  the  ill-effects  of  the  abrupt  changes 
in  v,  we  utilized  a  non-local  empirical  modification,  still  fully  consistent  with  the 
original  expansion  in  the  limit  A/L  — >  0,  but  sufficiently  robust  with  respect  to  the 
numerical  errors  related  to  the  high-order  derivatives  of  v.  A  similar  technique  was 
developed  by  Purser  [12,  13],  who  used  empirical  saturation  functions  to  stabilize 
higher-order  approximations  of  the  B„. 

In  general,  results  of  our  experiments  show  high  computational  efficiency  of  the 
LH1  scheme,  whose  total  CPU  requirements  is  just  a  fraction  of  the  CPU  time  re¬ 
quired  by  the  convolution  with  the  BEC  operator  -  a  negligible  amount  compared 
to  the  cost  of  a  3dVar  analysis.  Therefore,  LH1  approximations  to  the  BEC  diag¬ 
onal  may  serve  as  an  efficient  tool  for  renormalization  of  the  correlation  operators 
in  variational  data  assimilation,  as  they  are  capable  of  providing  accuracy  to  3-10% 
error  in  realistically  inhomogeneous  BEC  models. 

A  separate  question,  that  requires  further  investigation,  is  the  accurate  treatment 
of  the  boundary  conditions.  In  the  present  study  we  assumed  that  boundaries  affect 
only  the  magnitude  of  the  corresponding  columns  of  B,  but  not  their  structure.  This 
approximation  is  only  partly  consistent  with  the  zero  normal  flux  conditions  for  d, 
but  can  be  avoided  if  one  uses  ’’transparent”  boundary  conditions  (e.g.  [17])  which 
do  not  require  computation  of  the  adjustment  factors.  On  the  other  hand,  it  might  be 
beneficial  to  keep  physical  (no-flux)  boundary  conditions  in  the  formulation  of  d,  as 
they  are  likely  to  bring  more  realism  to  the  dynamics  of  the  BE  field. 

Another  important  issue  is  parameterization  of  v(x)  using  the  background  fields 
and  their  statistics.  In  the  simple  diffusion  tensor  model  used  in  the  experiments, 
anisotropic  BE  propagation  is  governed  by  the  background  velocity  field  and  su¬ 
perimposed  on  the  small-scale  isotropic  BE  diffusion,  which  takes  place  at  scales 
that  are  not  well  resolved  by  the  grid  (less  than  38).  More  sophisticated  parameter- 
izations  of  v(x)  are  surely  possible  and  require  further  studies.  In  particular,  recent 
studies  have  shown  that  since  v(x)  has  only  n(n  +  l)/2  independent  components,  it 
can  be  estimated  from  ensembles  of  moderate  (~  lOOn)  size  with  reasonable  accu¬ 
racy  [29,  30,  31,  32].  Finally,  the  considered  BEC  models  could  also  be  effectively 
used  for  adaptive/flow-dependent  covariance  localization  [33,  34],  which  is  an  is¬ 
sue  of  crucial  importance  in  improving  the  forecast  skill  of  the  state-of-the-art  data 
assimilation  systems. 
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Appendix  1 

Let  9  be  the  angle  between  x  and  k  in  B"  and  n  >  2.  Then  the  integral  (8.5)  can  be 
rewritten  in  spherical  coordinates  as 


B"(r)  =  (27r)“"/ B{k)  I  exp(-ikrcos9)k"~1dk  dQn-h  (8.53) 

0  £S„_, 

where  d Qn  |  is  the  element  of  the  surface  area  of  the  unit  sphere.  Since  cos  6 
changes  symmetrically  within  the  limits  of  integration,  the  imaginary  part  of  the 
exponent  vanishes.  Furthermore,  using  the  identity  d Q„ _  |  =  d£2n-  n  ■  sin"  2  6d6, 
the  integral  (8.53)  can  be  rewritten  as 


°o  n 

B"(r)  =  (2n)~n  J B(k)k”~ldk  J  dQ.n-2  j cos{krcos  9)  sin"-2 9d0  (8.54) 

o  n„_2  o 

Integration  over  9  and  substitution  of  the  formula  for  the  surface  of  (n  —  2)- 
dimensional  unit  sphere  into  (8.54)  yields  (8.6). 

The  general  relationship  (8.6)  also  holds  for  n=  1,2  although  these  cases  require 
a  special  (less  complicated)  treatment. 


Appendix  2 


In  practice,  the  matrix  elements  of  the  operator  (8.10)  are  never  calculated  explicitly 
due  to  the  immense  cost  of  such  a  computation.  Instead,  the  result  x'"{x)  of  the 
action  by  B  on  a  (discrete)  model  state  vector  X  (a)  is  calculated  by  solving  the 
linear  system  of  equations 


(l-b/2m)mxm=X°,  (8.55) 

where  D  denotes  the  discretized  diffusion  operator.  If  X°(x)  represents  the  ’’initial 
state”  and  the  “time  step”  St  is  prescribed  such  that  the  “integration  time”  is  mdt  = 
1,  then  action  of  the  operator  (8.55)  can  be  identified  as  a  result  of  a  discrete-time 
integration  of  the  diffusion  equation  dtX  =  D/2X  with  the  implicit  scheme 
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—  St  D x', 


(8.56) 


starting  from  the  initial  state  X°.  Here  i  denotes  the  time  step  number. 

Similarly,  the  action  of  exp(D/2)  is  never  computed  by  convolving  a  state  vector 
x  with  the  discretized  kernel  (8.42),  but  rather  by  the  discrete-time  integration  of 
the  diffusion  equation  with  the  explicit  numerical  scheme 

x' -x'^1  =  Igf  Dxi_1,  i  =  (8.57) 

such  that 

Xm=(l  +  D/2™)  x°  (8.58) 

in  correspondence  with  the  asymptotic  relation  (8.9)  for  the  Gaussian  kernel  Bg. 


Appendix  3 


By  definition,  a  Hadamard  matrix  (HM)  is  a  square  matrix  whose  entries  are  either 
1  or  -1  and  whose  columns  are  mutually  orthogonal.  The  simplest  way  to  construct 
HMs  is  the  recursive  Sylvester  algorithm  which  is  based  on  the  obvious  property:  if 
Hn  is  an  N  x  N  Hadamard  matrix,  then 


Hi  N 


Hv  Hn  ' 

Hn-Hn 


is  also  an  HM.  Starting  from  Hi  =  [  1  1 ;  1  — 1  ] ,  the  HMs  with  order  N  =  2” ,  n  =  1 , 2 . . . 
can  be  easily  constructed.  HMs  with  N  =  12,20  were  constructed  ’’manually”  more 
than  a  century  ago.  A  more  general  HM  construction  algorithm,  which  employs  the 
Galois  fields  theory,  was  found  in  1933.  In  the  present  study  we  used  the  MatLab 
software  that  only  handles  the  cases  when  M/12,  or  M/20  is  a  power  of  2.  Despite 
this  restriction,  the  available  values  of  M  were  sufficient  for  purposes  of  this  chapter. 
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